Predictive Correlation Screening: Application to Two-stage Predictor Design in High Dimension
Abstract
We introduce Predictive Correlation Screening (PCS), a new approach to variable selection for predictor design. PCS controls false positives among the selected variables, is well suited to small sample sizes, and scales to high dimensions. We establish asymptotic bounds on the familywise error rate (FWER) and on the resulting mean squared error of a linear predictor built on the selected variables. We apply PCS to the following two-stage predictor design problem. An experimenter wants to learn a multivariate predictor of gene expressions based on successive biological samples assayed on mRNA arrays. She assays the whole genome on a few samples and, from these assays, selects a small number of variables using PCS. To reduce assay cost, she subsequently assays only the selected variables on the remaining samples in order to learn the predictor coefficients. We show that PCS outperforms LASSO and correlation learning (sometimes referred to in the literature as marginal regression or simple thresholding) in both performance and computational complexity.
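The two-stage workflow described in the abstract can be sketched as follows. This is only an illustration under loud assumptions: the abstract does not specify the PCS statistic, so stage one here uses plain marginal correlation ranking (the "correlation learning" baseline the abstract mentions) as a stand-in, the response is taken to be scalar rather than multivariate, and the function names `screen_by_marginal_correlation` and `two_stage_fit` are hypothetical.

```python
import numpy as np


def screen_by_marginal_correlation(X, y, k):
    """Stage 1 stand-in (NOT the PCS statistic): keep the k columns of X
    with the largest absolute sample correlation with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc)
    # Constant columns get correlation 0 instead of a division by zero.
    corr = (Xc.T @ yc) / np.where(denom == 0, np.inf, denom)
    top_k = np.argsort(-np.abs(corr))[:k]
    return np.sort(top_k)


def two_stage_fit(X_stage1, y_stage1, X_stage2, y_stage2, k):
    """Select k variables on the few fully assayed samples (stage 1),
    then fit ordinary least squares using only those variables on the
    remaining samples (stage 2)."""
    selected = screen_by_marginal_correlation(X_stage1, y_stage1, k)
    # In the real experiment only these columns would be assayed in stage 2;
    # here that is simulated by slicing the full stage-2 matrix.
    X_sel = X_stage2[:, selected]
    design = np.column_stack([np.ones(X_sel.shape[0]), X_sel])
    solution, *_ = np.linalg.lstsq(design, y_stage2, rcond=None)
    intercept, coef = solution[0], solution[1:]
    return selected, coef, intercept


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p, n1, n2 = 200, 30, 100            # many variables, few stage-1 samples
    X = rng.standard_normal((n1 + n2, p))
    y = 3.0 * X[:, 3] - 2.5 * X[:, 17] + 0.1 * rng.standard_normal(n1 + n2)
    selected, coef, intercept = two_stage_fit(X[:n1], y[:n1], X[n1:], y[n1:], k=5)
    print("selected variables:", selected)
    print("stage-2 coefficients:", coef)
```

The split mirrors the cost argument in the abstract: the expensive full-width matrix is needed only for the `n1` stage-one samples, while stage two touches just `k` columns. Swapping in the actual PCS selection rule would only change the first function.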
Publication date: 2013